Automated Pattern Mining with a Scale Dimension

Authors

  • Jan M. Zytkow
  • Robert Zembowicz
Abstract

An important but neglected aspect of automated data mining is discovering patterns at different scales in the same data. Scale plays a role analogous to error: it can be used to focus the search for patterns on differences that exceed the given scale and to disregard smaller ones. We introduce a discovery mechanism that applies to bi-variate data. It combines search for maxima and minima with search for regularities in the form of equations. Groups of detected patterns are recursively searched for patterns on their parameters. If the mechanism cannot find a regularity for all data, it uses patterns discovered from data to divide the data into subsets, and recursively explores each subset. Detected patterns are subtracted from the data and the search continues in the residua. Our mechanism seeks patterns at each scale. Applied at many scales and to many data sets, the search seems explosive, but it terminates surprisingly fast because of data reduction and the requirements of pattern stability. We walk through an application on half a million datapoints, showing how our method leads to the discovery of many extrema, equations on their parameters, and equations that hold in subsets of data or in residua. Then we analyze the clues provided by the discovered regularities about phenomena in the environment in which the data have been gathered.
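
The idea of disregarding differences smaller than a given scale can be illustrated with a small Python sketch. This is not the authors' mechanism, only one plausible reading of scale-tolerant extrema search: a maximum is reported once the data fall more than `scale` below it, and a minimum once the data rise more than `scale` above it. The names `find_extrema`, `ys`, and `scale` are hypothetical, and the scan assumes the first extremum to look for is a maximum.

```python
def find_extrema(ys, scale):
    """Return (maxima, minima) as lists of indices into ys.

    A swing is accepted only if it exceeds `scale`; smaller wiggles are
    ignored. Assumption: the scan starts by looking for a maximum, so a
    leading minimum at the very start of the data is not reported.
    """
    maxima, minima = [], []
    hi_i = lo_i = 0              # indices of the running highest / lowest value
    looking_for_max = True
    for i, y in enumerate(ys):
        if y > ys[hi_i]:
            hi_i = i
        if y < ys[lo_i]:
            lo_i = i
        if looking_for_max:
            # The data dropped more than `scale` below the running high.
            if y < ys[hi_i] - scale:
                maxima.append(hi_i)
                lo_i = i         # restart the search for a low from here
                looking_for_max = False
        else:
            # The data rose more than `scale` above the running low.
            if y > ys[lo_i] + scale:
                minima.append(lo_i)
                hi_i = i         # restart the search for a high from here
                looking_for_max = True
    return maxima, minima


ys = [0, 1, 0.9, 1.1, 0, 0.2, 0.1, 2, 0]
coarse = find_extrema(ys, scale=0.5)    # small wiggles are ignored
fine = find_extrema(ys, scale=0.05)     # small wiggles count as extrema
```

On this toy series the coarse scale keeps only the two large peaks and the valley between them, while the fine scale also reports the small wiggles, which is the sense in which a larger scale reduces the number of patterns to examine.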

Related resources

A Data Mining approach for forecasting failure root causes: A case study in an Automated Teller Machine (ATM) manufacturing company

Based on the findings of Massachusetts Institute of Technology, organizations’ data double every five years. However, the rate of using data is 0.3. Nowadays, data mining tools have greatly facilitated the process of knowledge extraction from a welter of data. This paper presents a hybrid model using data gathered from an ATM manufacturing company. The steps of the research are based on CRISP-D...

Spatio-temporal QoS Pattern Analysis in Large Scale Internet Environment

For enhanced Quality of Service (QoS) provision of multimedia applications in Internet environment, there is a need of data mining tools supporting the automated analysis of QoS behaviour and dependencies for the purpose of modelling and forecasting, QoS planning and anomaly detection. This paper presents a data mining technology for spatio-temporal QoS pattern analysis including automated extr...

Exploring multi-dimensional sequential patterns across multi-dimensional multi-sequence databases

Existing multi-dimensional sequential pattern mining methods only discover multi-dimensional sequential patterns in databases involving one sequential dimension. Since multi-dimensional sequential patterns may exist in databases containing more than one sequential dimension, in this paper, we present the algorithm PSeq-MIDim for mining multi-dimensional sequential patterns from multiple sequential d...

Statistical Proof Pattern Recognition: Automated or Interactive?

In this paper, we compare different existing approaches employed in data mining of big proof libraries in automated and interactive theorem proving.

Automated detection of coronavirus disease (COVID-19) by using data-mining techniques: a brief report

Background: The clinical field holds vast amounts of patient data that have not been analyzed. Discovering a way to analyze this raw data and turn it into an information treasure can save many lives. Using data mining methods is an efficient way to analyze this large amount of raw data. It can predict the future with accurate knowledge of the past, providing new insights into disease diagnosis and prevention. S...

Publication date: 1996